Scaling and percolation in the small-world network model 



M. E. J. Newman and D. J. Watts 
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501 
(May 6, 1999) 

In this paper we study the small-world network model of Watts and Strogatz, which mimics some 
aspects of the structure of networks of social interactions. We argue that there is one non-trivial 
length-scale in the model, analogous to the correlation length in other systems, which is well-defined 
in the limit of infinite system size and which diverges continuously as the randomness in the network 
tends to zero, giving a normal critical point in this limit. This length-scale governs the cross-over 
from large- to small-world behavior in the model, as well as the number of vertices in a neighborhood 
of given radius on the network. We derive the value of the single critical exponent controlling 
behavior in the critical region and the finite size scaling form for the average vertex-vertex distance 
on the network, and, using series expansion and Padé approximants, find an approximate analytic 
form for the scaling function. We calculate the effective dimension of small-world graphs and show 
that this dimension varies as a function of the length-scale on which it is measured, in a manner 
reminiscent of multifractals. We also study the problem of site percolation on small-world networks 
as a simple model of disease propagation, and derive an approximate expression for the percolation 
probability at which a giant component of connected vertices first forms (in epidemiological terms, 
the point at which an epidemic occurs). The typical cluster radius satisfies the expected finite size 
scaling form with a cluster size exponent close to that for a random graph. All our analytic results 
are confirmed by extensive numerical simulations of the model. 

I. INTRODUCTION 

Networks of social interactions between individuals, 
groups, or organizations have some unusual topological 
properties which set them apart from most of the net- 
works with which physics deals. They appear to display 
simultaneously properties typical both of regular lattices 
and of random graphs. For instance, social networks 
have well-defined locales in the sense that if individual A 
knows individual B and individual B knows individual C, 
then it is likely that A also knows C — much more likely 
than if we were to pick two individuals at random from 
the population and ask whether they are acquainted. In 
this respect social networks are similar to regular lattices, 
which also have well-defined locales, but very different 
from random graphs, in which the probability of connec- 
tion is the same for any pair of vertices on the graph. 
On the other hand, it is widely believed that one can get 
from almost any member of a social network to any other 
via only a small number of intermediate acquaintances, 
the exact number typically scaling as the logarithm of 
the total number of individuals comprising the network. 
Within the population of the world, for example, it has 
been suggested that there are only about "six degrees 
of separation" between any human being and any other 
[1]. This behavior is not seen in regular lattices but is a 
well-known property of random graphs, where the average 
shortest path between two randomly-chosen vertices 
scales as $\log N / \log z$, where $N$ is the total number of 
vertices in the graph and $z$ is the average coordination 
number. 



Recently, Watts and Strogatz have proposed a 
model which attempts to mimic the properties of social 
networks. This "small-world" model consists of a network 
of vertices whose topology is that of a regular lattice, 
with the addition of a low density $\phi$ of connections 
between randomly-chosen pairs of vertices. Watts 
and Strogatz showed that graphs of this type can indeed 
possess well-defined locales in the sense described above 
while at the same time possessing average vertex-vertex 
distances which are comparable with those found on true 
random graphs, even for quite small values of $\phi$. 

In this paper we study in detail the behavior of the 
small-world model, concentrating particularly on its scaling 
properties. The outline of the paper is as follows. In 
Section II we define the model. In Section III we study 
the typical length-scales present in the model and argue 
that the model undergoes a continuous phase transition 
as the density of random connections tends to zero. We 
also examine the cross-over between large- and small-world 
behavior in the model, and the structure of "neighborhoods" 
of adjacent vertices. In Section IV we derive a 
scaling form for the average vertex-vertex distance on a 
small-world graph and demonstrate numerically that this 
form is followed over a wide range of the parameters of 
the model. In Section V we calculate the effective dimension 
of small-world graphs and show that this dimension 
depends on the length-scale on which we examine the 
graph. In Section VI we consider the properties of site 
percolation on these systems, as a model of the spread of 
information or disease through social networks. Finally, 
in Section VII we give our conclusions. 

FIG. 1. (a) An example of a small-world graph with 
$L = 24$, $k = 1$ and, in this case, four shortcuts. (b) An 
example with $k = 3$. 

II. THE SMALL-WORLD MODEL 

The original small-world model of Watts and Strogatz, 
in its simplest incarnation, is defined as follows. We take 
a one-dimensional lattice of $L$ vertices with connections 
or bonds between nearest neighbors and periodic boundary 
conditions (the lattice is a ring). Then we go through 
each of the bonds in turn and independently with some 
probability $\phi$ "rewire" it. Rewiring in this context means 
shifting one end of the bond to a new vertex chosen 
uniformly at random from the whole lattice, with the 
exception that no two vertices can have more than one 
bond running between them, and no vertex can be connected 
by a bond to itself. In this model the average 
coordination number $z$ remains constant ($z = 2$) during 
the rewiring process, but the coordination number of 
any particular vertex may change. The total number of 
rewired bonds, which we will refer to as "shortcuts", is 
$\phi L$ on average. 

For the purposes of analytic treatment the Watts- 
Strogatz model has a number of problems. One problem 
is that the distribution of shortcuts is not completely uni- 
form; not all choices of the positions of the rewired bonds 
are equally probable. For example, configurations with 
more than one bond between a particular pair of vertices 
are explicitly forbidden. This non-uniformity of the dis- 
tribution makes an average over different realizations of 
the randomness hard to perform. 

A more serious problem is that one of the crucial quan- 
tities of interest in the model, the average distance be- 
tween pairs of vertices on the graph, is poorly defined. 
The reason is that there is a finite probability of a por- 
tion of the lattice becoming detached from the rest in 
this model. Formally, we can represent this by saying 
that the distance from such a portion to a vertex else- 
where on the lattice is infinite. However, this means that 
the average vertex-vertex distance on the lattice is then 
itself infinite, and hence that the vertex-vertex distance 
averaged over all realizations is also infinite. For numerical 
studies such as those of Watts and Strogatz this does 
not present any substantial difficulties, but for analytic 
work it results in a number of quantities and expressions 
being poorly defined. 

FIG. 2. (a) An example of a $k = 1$ small-world graph with 
an underlying lattice of dimension $d = 2$. (b) The pattern of 
bonds around a vertex on the $d = 2$ lattice for $k = 3$. 

Both of these problems can be circumvented by a 
slight modification of the model. In our version of the 
small-world model we again start with a regular 
one-dimensional lattice, but now instead of rewiring each 
bond with probability $\phi$, we add shortcuts between pairs 
of vertices chosen uniformly at random but we do not 
remove any bonds from the regular lattice. We also 
explicitly allow there to be more than one bond between 
any two vertices, or a bond which connects a vertex to 
itself. In order to preserve compatibility with the results 
of Watts and Strogatz and others, we add with probability 
$\phi$ one shortcut for each bond on the original lattice, 
so that there are again $\phi L$ shortcuts on average. The 
average coordination number is $z = 2(1 + \phi)$. This model 
is equivalent to the Watts-Strogatz model for small $\phi$, 
whilst being better behaved when $\phi$ becomes comparable 
to 1. Fig. 1(a) shows one realization of our model for 
$L = 24$. 

Real social networks usually have average coordination 
numbers $z$ significantly higher than 2, and we can arrange 
for higher $z$ in our model in a number of ways. Watts 
and Strogatz proposed adding bonds to next-nearest 
or further neighbors on the underlying one-dimensional 
lattice up to some fixed range which we will call $k$. In 
our variation on the model we can also start with such a 
lattice and then add shortcuts to it. The mean number 
of shortcuts is then $\phi k L$ and the average coordination 
number is $z = 2k(1 + \phi)$. Fig. 1(b) shows a realization of 
this model for $k = 3$. 
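
The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations in this paper: the function names `small_world_edges` and `mean_coordination` are hypothetical, and the sketch assumes a one-dimensional ring with $L > 2k$ so that the lattice bonds are all distinct.

```python
import random


def small_world_edges(L, k, phi, rng):
    """Return (lattice_bonds, shortcuts) for a ring of L vertices.

    Lattice bonds join each vertex to its neighbors out to range k, giving
    k*L bonds in total. For each lattice bond one shortcut is added with
    probability phi between two vertices chosen uniformly at random.
    Self-connections and repeated bonds are allowed, as in the text.
    """
    lattice = [(i, (i + j) % L) for i in range(L) for j in range(1, k + 1)]
    shortcuts = []
    for _ in lattice:
        if rng.random() < phi:
            shortcuts.append((rng.randrange(L), rng.randrange(L)))
    return lattice, shortcuts


def mean_coordination(L, lattice, shortcuts):
    """Average coordination number: every bond contributes two bond ends."""
    return 2.0 * (len(lattice) + len(shortcuts)) / L


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed, chosen arbitrarily
    lattice, shortcuts = small_world_edges(L=24, k=3, phi=0.05, rng=rng)
    print(len(lattice), len(shortcuts), mean_coordination(24, lattice, shortcuts))
```

Averaged over many realizations, the number of shortcuts produced this way should approach $\phi k L$ and the coordination number $2k(1 + \phi)$.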

Another way of increasing the coordination number, 
suggested first by Watts, is to use an underlying lattice 
for the model with dimension greater than one. In 
this paper we will consider networks based on square and 
(hyper)cubic lattices in $d$ dimensions. We take a lattice 
of linear dimension $L$, with $L^d$ vertices, nearest-neighbor 
bonds and periodic boundary conditions, and add shortcuts 
between randomly chosen pairs of vertices. Such a 
graph has $\phi d L^d$ shortcuts and an average coordination 
number $z = 2d(1 + \phi)$. An example is shown in Fig. 2(a) 
for $d = 2$. We can also add bonds between next-nearest or 
further neighbors to such a lattice. The most straightforward 
generalization of the one-dimensional case is to add 
bonds along the principal axes of the lattice up to some 
fixed range $k$, as shown in Fig. 2(b) for $k = 3$. Graphs of 
this type have $\phi k d L^d$ shortcuts on average and a mean 
coordination number of $z = 2kd(1 + \phi)$. 

Our main interest in this paper is with the properties 
of the small-world model for small values of the shortcut 
probability $\phi$. Watts and Strogatz found that the 
model displays many of the characteristics of true random 
graphs even for $\phi \ll 1$, and it seems to be in this regime 
that the model's properties are most like those of 
real-world social networks. 



III. LENGTH-SCALES IN SMALL-WORLD GRAPHS 

A fundamental observable property of interest on 
small-world lattices is the shortest path between two 
vertices (the number of degrees of separation), measured 
as the number of bonds traversed to get from one vertex 
to another, averaged over all pairs of vertices and over 
all realizations of the randomness in the model. We denote 
this quantity $\ell$. On ordinary regular lattices $\ell$ scales 
linearly with the lattice size $L$. On the underlying lattices 
used in the models described here, for instance, it 
is equal to $\frac{1}{4} dL/k$. On true random graphs, in which 
the probability of connection between any two vertices 
is the same, $\ell$ is proportional to $\log N / \log z$, where $N$ is 
the number of vertices on the graph. The small-world 
model interpolates between these extremes, showing linear 
scaling $\ell \sim L$ for small $\phi$, or on systems small enough 
that there are very few shortcuts, and logarithmic scaling 
$\ell \sim \log N = d \log L$ when $\phi$ or $L$ is large enough. In this 
section and the following one we study the nature of the 
cross-over between these two regimes, which we refer to 
as "large-world" and "small-world" regimes respectively. 
For simplicity we will work mostly with the case $k = 1$, 
although we will quote results for $k > 1$ where they are 
of interest. 

When $k = 1$ the small-world model has only one 
independent parameter, the probability $\phi$, and hence can 
have only one non-trivial length-scale other than the lattice 
constant of the underlying lattice. This length-scale, 
which we will denote $\xi$, can be defined in a number of 
different ways, all definitions being necessarily proportional 
to one another. One simple way is to define $\xi$ to 
be the typical distance between the ends of shortcuts on 
the lattice. In a one-dimensional system with $k = 1$, for 
example, there are on average $\phi L$ shortcuts and therefore 
$2\phi L$ ends of shortcuts. Since the lattice has $L$ vertices, 
the average distance between ends of shortcuts is 
$L/(2\phi L) = 1/(2\phi)$. In fact, it is more convenient for 
our purposes to define $\xi$ without the factor of 2 in the 
denominator, so that $\xi = 1/\phi$, or for general $d$ 

$$\xi = \frac{1}{(\phi d)^{1/d}}. \tag{1}$$

For $k > 1$ the appropriate generalization is 

$$\xi = \frac{1}{(\phi k d)^{1/d}}. \tag{2}$$

As we see, $\xi$ diverges as $\phi \to 0$ according to [9] 

$$\xi \sim \phi^{-\tau}, \tag{3}$$

where the exponent $\tau$ is 

$$\tau = \frac{1}{d}. \tag{4}$$

A number of authors have previously considered a 
divergence of the kind described by Eq. (3) with $\xi$ defined 
not as the typical distance between the ends of shortcuts, 
but as the system size $L$ at which the cross-over 
from large- to small-world scaling occurs [10]-[13]. We 
will shortly argue that in fact the length-scale $\xi$ defined 
here is precisely equal to this cross-over length, and hence 
that these two divergences are the same. 

The quantity $\xi$ plays a role similar to that of the 
correlation length in an interacting system in standard 
statistical physics. Its divergence leaves the system with no 
length-scale other than the lattice spacing, so that at long 
distances we expect all spatial distributions to be scale-free. 
This is precisely the behavior one sees in an interacting 
system undergoing a continuous phase transition, and it is 
reasonable to regard the small-world model as having a 
continuous phase transition at this point. Note that the 
transition is a one-sided one since $\phi$ is a probability and 
cannot take values less than zero. In this respect the 
transition is similar to that seen in the one-dimensional 
Ising model, or in percolation on a one-dimensional lattice. 
The exponent $\tau$ plays the part of a critical exponent 
for the system, similar to the correlation length exponent 
$\nu$ for a thermal phase transition. 

De Menezes et al. have argued that the length-scale 
$\xi$ can only be defined in terms of the cross-over point 
between large- and small-world behavior, and that there is no 
definition of $\xi$ which can be made consistent in the limit 
of large system size. For this reason they argue that 
the transition at $\phi = 0$ should be regarded as first-order 
rather than continuous. In fact however, the arguments 
of de Menezes et al. show only that one particular definition 
of $\xi$ is inconsistent; they show that $\xi$ cannot be 
consistently defined in terms of the mean vertex-vertex 
distance between vertices in finite regions of infinite 
small-world graphs. This does not prove that no definition 
of $\xi$ is consistent in the $L \to \infty$ limit and, as we have 
demonstrated here, consistent definitions do exist. Thus 
it seems appropriate to consider the transition at $\phi = 0$ 
to be a continuous one. 

Barthélémy and Amaral [10] have conjectured on the 
basis of numerical simulations that $\tau = \frac{2}{3}$ for $d = 1$. As 
we have shown here, $\tau$ is in fact equal to $1/d$, and specifically 
$\tau = 1$ in one dimension. We have also demonstrated 
this result previously using a renormalization group (RG) 
argument [12], and it has been confirmed by extensive 
numerical simulations [11]-[13]. 

The length-scale $\xi$ governs a number of other properties 
of small-world graphs. First, as mentioned above, 
it defines the point at which the average vertex-vertex 
distance $\ell$ crosses over from linear to logarithmic scaling 
with system size $L$. This statement is necessarily 
true, since $\xi$ is the only non-trivial length scale in the 
model, but we can demonstrate it explicitly by noting 
that the linear scaling regime is the one in which the 
average number of shortcuts on the lattice is small compared 
with unity and the logarithmic regime is the one in 
which it is large. The cross-over occurs in the region 
where the average number of shortcuts is about one, or 
in other words when $\phi k d L^d \simeq 1$. Rearranging for $L$, the 
cross-over length is 

$$L = \frac{1}{(\phi k d)^{1/d}}, \tag{5}$$

which is exactly the expression for $\xi$ given in Eq. (2). 
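
As a quick numerical illustration of Eqs. (2) and (5), the sketch below evaluates the length-scale for given parameters and checks that a system of linear size $L = \xi$ contains one shortcut on average. The function names `crossover_length` and `mean_shortcuts` are hypothetical and are introduced only for this example.

```python
def crossover_length(phi, k=1, d=1):
    """Length-scale xi = 1 / (phi*k*d)**(1/d), as in Eqs. (2) and (5)."""
    return 1.0 / (phi * k * d) ** (1.0 / d)


def mean_shortcuts(phi, L, k=1, d=1):
    """Average number of shortcuts phi*k*d*L**d on a lattice of linear size L."""
    return phi * k * d * L ** d


if __name__ == "__main__":
    for d in (1, 2, 3):
        xi = crossover_length(0.001, k=1, d=d)
        # By construction a system of linear size xi holds about one shortcut.
        print(d, xi, mean_shortcuts(0.001, xi, k=1, d=d))
```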



The length-scale $\xi$ also governs the average number 
$V(r)$ of neighbors of a given vertex within a neighborhood 
of radius $r$. The number of vertices in such a neighborhood 
increases as $r^d$ for $r \ll \xi$ while for $r \gg \xi$ the graph 
behaves as a random graph and the size of the neighborhood 
must increase exponentially with some power of 
$r/\xi$. To derive the specific functional form of $V(r)$ we 
consider a small-world graph in the limit of infinite $L$. 
Let $a(r)$ be the surface area of a "sphere" of radius $r$ on 
the underlying lattice of the model, i.e., it is the number 
of points which are exactly $r$ steps away from any vertex. 
(For $k = 1$, $a(r) = 2^d r^{d-1}/\Gamma(d)$ when $r \gg 1$.) The 
volume within a neighborhood of radius $r$ in an infinite 
system is the sum of $a(r')$ over all $r' \le r$, plus a contribution of 
$V(r - r')$ for every shortcut encountered at a distance $r'$, 
of which there are on average $2\xi^{-d} a(r')$. Thus $V(r)$ is in 
general the solution of the equation 

$$V(r) = \sum_{r'=0}^{r} a(r') \left[ 1 + 2\xi^{-d} V(r - r') \right]. \tag{6}$$



In one dimension with $k = 1$, for example, $a(r) = 2$ for 
all $r$ and, approximating the sum with an integral and 
then differentiating with respect to $r$, we get 

$$\frac{\mathrm{d}V}{\mathrm{d}r} = 2 \left[ 1 + 2V(r)/\xi \right], \tag{7}$$

which has the solution 

$$V(r) = \tfrac{1}{2} \xi \left( e^{4r/\xi} - 1 \right). \tag{8}$$
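
The step from Eq. (7) to Eq. (8) can be checked numerically. The sketch below integrates Eq. (7) from $V(0) = 0$ with a standard fourth-order Runge-Kutta scheme and compares the result with the closed form of Eq. (8). The function names and the step count are choices made for this illustration only.

```python
import math


def volume_closed_form(r, xi):
    """Eq. (8): V(r) = (xi/2) * (exp(4r/xi) - 1)."""
    return 0.5 * xi * math.expm1(4.0 * r / xi)


def surface_area(r, xi):
    """Derivative of Eq. (8): A(r) = 2 * exp(4r/xi)."""
    return 2.0 * math.exp(4.0 * r / xi)


def volume_rk4(r_max, xi, steps=10000):
    """Integrate dV/dr = 2 * (1 + 2V/xi) from V(0) = 0 up to r = r_max."""
    def rhs(v):
        return 2.0 * (1.0 + 2.0 * v / xi)

    h = r_max / steps
    v = 0.0
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs(v + 0.5 * h * k1)
        k3 = rhs(v + 0.5 * h * k2)
        k4 = rhs(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


if __name__ == "__main__":
    xi = 100.0
    for r in (1.0, 50.0, 200.0):
        print(r, volume_rk4(r, xi), volume_closed_form(r, xi))
```

For $r$ much smaller than $\xi$ both values stay close to $2r$, the volume on the bare ring, while for $r$ larger than $\xi$ they grow exponentially.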



Note that for $r \ll \xi$ this scales as $r$, independent of $\xi$, 
and for $r \gg \xi$ it grows exponentially, as expected. Eq. (8) 
also implies that the surface area of a sphere of radius $r$ 
on the graph, which is the derivative of $V(r)$, should be 

$$A(r) = 2 e^{4r/\xi}. \tag{9}$$

FIG. 3. The mean surface area $A(r)$ of a neighborhood of 
radius $r$ on a $d = 1$ small-world graph with $\phi = 0.01$ for 
$L = 128 \ldots 131\,072$ (solid lines). The measurements are 
averaged over 1000 realizations of the system each. The dotted 
line is the theoretical result for $L = \infty$, Eq. (9). 



These results are easily checked numerically and give us 
a simple independent measurement of $\xi$ which we can 
use to confirm our earlier arguments. In Fig. 3 we show 
curves of $A(r)$ from computer simulations of systems with 
$\phi = 0.01$ for values of $L$ equal to powers of two from 128 
up to 131 072 (solid lines). The dotted line is Eq. (9) with 
$\xi$ taken from Eq. (1). The convergence of the simulation 
results to the predicted exponential form as the system 
size grows confirms our contention that $\xi$ is well-defined 
in the limit of large $L$. Fig. 4 shows $A(r)$ for $L = 100\,000$ 
for various values of $\phi$. Eq. (9) implies that the slope of 
the lines in the limit of small $r$ is $4/\xi$. In the inset we 
show the values of $\xi$ extracted from fits to the slope as a 
function of $\phi$ on logarithmic scales, and a straight-line fit 
to these points gives us an estimate of $\tau = 0.99 \pm 0.01$ for 
the exponent governing the transition at $\phi = 0$ (Eq. (3)). 
This is in good agreement with our theoretical prediction 
that $\tau = 1$. 



IV. SCALING IN SMALL-WORLD GRAPHS 

Given the existence of the single non-trivial length-scale 
$\xi$ for the small-world model, we can also say how 
the mean vertex-vertex distance $\ell$ should scale with system 
size and other parameters near the phase transition. 
In this regime the dimensionless quantity $\ell/L$ can be a 
function only of the dimensionless quantity $L/\xi$, since 
no other dimensionless combinations of variables exist. 
Thus we can write 



$$\ell = L f(L/\xi), \tag{10}$$

where $f(x)$ is an unknown but universal scaling function. 
A scaling form similar to this was suggested previously 
by Barthélémy and Amaral [10] on empirical grounds. 
Substituting from Eq. (1), we then get for the $k = 1$ case 

$$\ell = L f(\phi^{1/d} L). \tag{11}$$

(We have absorbed a factor of $d^{1/d}$ into the definition 
of $f(x)$ here to make it consistent with the definition 
we used in Ref. [12].) The usefulness of this equation 
derives from the fact that the function $f(x)$ contains no 
dependence on $\phi$ or $L$ other than the explicit dependence 
introduced through its argument. Its functional form can 
however change with dimension $d$ and indeed it does. In 
order to obey the known asymptotic forms of $\ell$ for large 
and small systems, the scaling function $f(x)$ must satisfy 

$$f(x) \sim \frac{\log x}{x} \qquad \text{as } x \to \infty, \tag{12}$$

and 

$$f(x) \to \tfrac{1}{4} d \qquad \text{as } x \to 0. \tag{13}$$

FIG. 4. The mean surface area $A(r)$ of a neighborhood of 
radius $r$ on a $d = 1$ small-world graph with $L = 100\,000$ for 
various values of $\phi$. The measurements are averaged over 1000 
realizations of the system each. Inset: the value of $\xi$ extracted 
from the curves in the main figure, as a function of $\phi$. The 
gradient of the line gives the value of the exponent $\tau$, which is 
found by a least squares fit (the dotted line) to be $0.99 \pm 0.01$. 



When $k > 1$, $\ell$ tends to $\frac{1}{4} dL/k$ for small values of $L$ and 
$\xi$ is given by Eq. (2), so the appropriate generalization of 
the scaling form is 

$$\ell = \frac{L}{k} f\bigl( (\phi k)^{1/d} L \bigr), \tag{14}$$

with $f(x)$ taking the same limiting forms (12) and (13). 
Previously we derived this scaling form in a more rigorous 
way using an RG argument [12]. 

FIG. 5. Data collapse for numerical measurements of the 
mean vertex-vertex distance on small-world graphs with 
$d = 1$. Circles and squares are results for $k = 1$ and $k = 5$ 
respectively for values of $L$ between 128 and 32 768 and a 
range of values of $\phi$. Each point is averaged 
over 1000 realizations of the randomness. In all cases the 
errors on the points are smaller than the points themselves. 
The dashed line is the second-order series approximation with 
exact coefficients given in Eq. (18), while the dot-dashed line 
is the fifth-order approximation using numerical results for 
the last three coefficients. The solid line is the third-order 
Padé approximant, Eqs. (21) and (23). Inset: data collapse 
for two-dimensional systems with $k = 1$ for values of $L$ from 
64 to 1024 and a range of values of $\phi$. 



We can again test these results numerically by measuring 
$\ell$ on small-world graphs for various values of $\phi$, $k$ and 
$L$. Eq. (14) implies that if we plot the results on a graph 
of $\ell k/L$ against $(\phi k)^{1/d} L$, they should collapse onto a 
single curve for any given dimension $d$. In Fig. 5 we have 
done this for systems based on underlying lattices with 
$d = 1$ for a range of values of $\phi$ and $L$, for $k = 1$ and 5. 
As the figure shows, the collapse is excellent. In the inset 
we show results for $d = 2$ with $k = 1$, which also collapse 
nicely onto a single curve. The lower limits of the scaling 
functions in each case are in good agreement with our 
theoretical predictions of $\frac{1}{4}$ for $d = 1$ and $\frac{1}{2}$ for $d = 2$. 
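
A measurement of this kind can be sketched as follows for $d = 1$. The code builds one realization of the model, computes the mean vertex-vertex distance by breadth-first search from every vertex, and returns the pair of scaling variables $(\phi k L, \ell k / L)$ that would be plotted in a data collapse. It is a minimal illustration rather than the simulation code behind Fig. 5: the function names are hypothetical, only a single realization is used instead of an average over many, and the all-pairs search limits it to modest system sizes.

```python
import random
from collections import deque


def build_adjacency(L, k, phi, rng):
    """Ring of L vertices with bonds out to range k, plus random shortcuts."""
    adj = [[] for _ in range(L)]
    for i in range(L):
        for j in range(1, k + 1):
            u = (i + j) % L
            adj[i].append(u)
            adj[u].append(i)
            # One shortcut per lattice bond with probability phi.
            if rng.random() < phi:
                a, b = rng.randrange(L), rng.randrange(L)
                adj[a].append(b)
                adj[b].append(a)
    return adj


def mean_distance(adj):
    """Mean shortest-path length over all ordered pairs of distinct vertices."""
    n = len(adj)
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # The lattice bonds are never removed, so every vertex is reached.
        total += sum(dist)
    return total / (n * (n - 1))


def scaling_point(L, k, phi, ell):
    """Scaling variables for d = 1: x = phi*k*L and y = ell*k/L."""
    return phi * k * L, ell * k / L


if __name__ == "__main__":
    rng = random.Random(7)  # arbitrary seed
    for L in (128, 256, 512):
        adj = build_adjacency(L, 1, 0.01, rng)
        print(L, scaling_point(L, 1, 0.01, mean_distance(adj)))
```

With $\phi = 0$ the second scaling variable reduces to the bare-ring value, which approaches $\frac{1}{4}$ as $L$ grows.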

We are not able to solve exactly for the form of the 
scaling function $f(x)$, but we can express it as a series 
expansion in powers of $\phi$ as follows. Since the scaling 
function is universal and has no implicit dependence on 
$k$, it is adequate to calculate it for the case $k = 1$; its 
form is the same for all other values of $k$. For $k = 1$ the 
probability of having exactly $m$ shortcuts on the graph 
is 

$$P_m = \binom{dL^d}{m} \phi^m (1 - \phi)^{dL^d - m}. \tag{15}$$

    m    $\ell_m/L$
    0    1/4 
    1    5/24 
    2    131/720 
    3    0.1549 ± 0.0003 
    4    0.1365 ± 0.0003 
    5    0.1232 ± 0.0003 

TABLE I. Average vertex-vertex distances per vertex 
$\ell_m/L$ on $d = 1$ small-world graphs with exactly $m$ shortcuts 
and $k = 1$. Values up to $m = 2$ are the exact results of Strang 
and Eriksson. Values for $m = 3 \ldots 5$ are our numerical 
results. 



Let $\ell_m$ be the mean vertex-vertex distance on a graph 
with $m$ shortcuts in the limit of large $L$, averaged over all 
such graphs. Then the mean vertex-vertex distance averaged 
over all graphs regardless of the number of shortcuts 
is 

$$\ell = \sum_{m=0}^{dL^d} P_m \ell_m. \tag{16}$$



Note that in order to calculate $\ell$ up to order $\phi^m$ we only 
need to know the behavior of the model when it has $m$ 
or fewer shortcuts. For the $d = 1$ case the values of 
the $\ell_m$ have been calculated up to $m = 2$ by Strang and 
Eriksson and are given in Table I. Substituting these 
into Eq. (16) and collecting terms in $\phi$, we then find that 

$$\frac{\ell}{L} = \frac{1}{4} - \frac{1}{24} \phi L + \frac{11}{1440} \phi^2 L^2 - \frac{11}{1440} \phi^2 L + O(\phi^3). \tag{17}$$

The term in $\phi^2 L$ can be dropped when $L$ is large or $\phi$ 
small, since it is negligible by comparison with at least 
one of the terms before it. Thus the scaling function is 

$$f(x) = \frac{1}{4} - \frac{1}{24} x + \frac{11}{1440} x^2 + O(x^3). \tag{18}$$

This form is shown as the dashed line in Fig. 5 and agrees 
well with the numerical calculations for small values of 
the scaling variable $x$, but deviates badly for large values. 

Calculating the exact values of the quantities $\ell_m$ for 
higher orders is an arduous task and probably does not 
justify the effort involved. However, we have calculated 
the values of the $\ell_m$ numerically up to $m = 5$ by evaluating 
the average vertex-vertex distance $\ell$ on graphs 
which are constrained to have exactly 3, 4 or 5 shortcuts. 
Performing a Taylor expansion of $\ell/L$ about $L = \infty$, we 
get 

$$\frac{\ell}{L} = \frac{\ell_m}{L} + \frac{c}{L} + O(L^{-2}), \tag{19}$$

where $c$ is a constant. Thus we can estimate $\ell_m/L$ from 
the vertical-axis intercept of a plot of $\ell/L$ against $L^{-1}$ for 
large $L$. The results are shown in Table I. Calculating 
higher orders still would be straightforward. 

Using these values we have evaluated the scaling function 
$f(x)$ up to fifth order in $x$; the result is shown as 
the dot-dashed line in Fig. 5. As we can see the range 
over which it matches the numerical results is greater 
than before, but not by much, indicating that the series 
expansion converges only slowly as extra terms are 
added. It appears therefore that series expansion would 
be a poor way of calculating $f(x)$ over the entire range 
of interest. 

A much better result can be obtained by using our series 
expansion coefficients to define a Padé approximant 
to $f(x)$. Since we know that $f(x)$ tends to a constant 
$f(0) = \frac{1}{4} d$ for small $x$ and falls off approximately 
as $1/x$ for large $x$, the appropriate Padé approximants to 
use are odd-order approximants where the approximant 
of order $2n + 1$ ($n$ integer) has the form 

$$f(x) = f(0) \frac{A_n(x)}{B_{n+1}(x)}, \tag{20}$$

where $A_n(x)$ and $B_n(x)$ are polynomials in $x$ of degree 
$n$ with constant term equal to 1. For example, to third 
order we should use the approximant 

$$f(x) = f(0) \frac{1 + a_1 x}{1 + b_1 x + b_2 x^2}. \tag{21}$$

Expanding about $x = 0$ this gives 

$$\frac{f(x)}{f(0)} = 1 + (a_1 - b_1) x + (b_1^2 - a_1 b_1 - b_2) x^2 + \left[ (a_1 - b_1)(b_1^2 - b_2) + b_1 b_2 \right] x^3 + O(x^4). \tag{22}$$

Equating coefficients order by order in $x$ and solving for 
the $a$'s and $b$'s, we find that 

$$a_1 = 1.825 \pm 0.075, \qquad b_1 = 1.991 \pm 0.075, \qquad b_2 = 0.301 \pm 0.012. \tag{23}$$

Substituting these back into (21) and using the known 
value of $f(0)$ then gives us our approximation to $f(x)$. 
This approximation is plotted as the solid line in Fig. 5 
and, as the figure shows, is an excellent guide to the value 
of $f(x)$ over a large range of $x$. In theory it should be possible 
to calculate the fifth-order Padé approximant using 
the numerical results in Table I, although we have not 
done this here. Substituting $f(x)$ back into the scaling 
form, Eq. (14), we can also use the Padé approximant 
to predict the value of the mean vertex-vertex distance 
for any values of $\phi$, $k$ and $L$ within the scaling regime. 
We will make use of this result in the next section to 
calculate the effective dimension of small-world graphs. 
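
The arithmetic behind the series coefficients and the third-order Padé coefficients can be reproduced from Table I with a short script. This sketch rests on one assumption that should be stated plainly: it works directly in the scaling limit ($L \to \infty$ at fixed $x = \phi L$), where the binomial distribution of Eq. (15) becomes a Poisson distribution with mean $x$, so that $f(x) = \sum_m e^{-x} (x^m/m!) \, \ell_m/L$. The function names are hypothetical, and the central Table I value for $m = 3$ is used without its error bar.

```python
from fractions import Fraction
from math import factorial

# ell_m / L for m = 0..3 on d = 1, k = 1 graphs (Table I); m = 3 is numerical.
ELL_OVER_L = [Fraction(1, 4), Fraction(5, 24), Fraction(131, 720), Fraction("0.1549")]


def series_coefficients(ell_over_l):
    """Coefficients of x^0, x^1, ... in f(x) = sum_m e^{-x} x^m/m! * ell_m/L."""
    coeffs = []
    for order in range(len(ell_over_l)):
        c = Fraction(0)
        for m in range(order + 1):
            # x^m/m! times the x^(order-m) term of e^{-x}.
            sign = (-1) ** (order - m)
            c += ell_over_l[m] * Fraction(sign, factorial(m) * factorial(order - m))
        coeffs.append(c)
    return coeffs


def pade_third_order(coeffs):
    """Solve Eq. (22) for (a1, b1, b2) given the series of f(x) up to x^3."""
    f0 = coeffs[0]
    c1, c2, c3 = (c / f0 for c in coeffs[1:4])
    b1 = (c3 - c1 * c2) / (c1 * c1 - c2)
    a1 = b1 + c1             # from c1 = a1 - b1
    b2 = -c1 * b1 - c2       # from c2 = b1^2 - a1*b1 - b2
    return a1, b1, b2


if __name__ == "__main__":
    coeffs = series_coefficients(ELL_OVER_L)
    print([str(c) for c in coeffs[:3]])
    print([round(float(v), 3) for v in pade_third_order(coeffs)])
```

The first three coefficients come out as the exact fractions appearing in Eq. (18), and the Padé coefficients land inside the error bars quoted in Eq. (23). Shifting the $m = 3$ entry by its quoted uncertainty moves $b_1$ by roughly the $\pm 0.075$ given there, which is consistent with the quoted errors being dominated by that one numerical input.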

V. EFFECTIVE DIMENSION 

The calculation of the volumes and surface areas of 
neighborhoods of vertices on small-world graphs in 
Section III leads us naturally to the consideration of the 
dimension of these systems. On a regular lattice of 
dimension $D$, the volume $V(r)$ of a neighborhood of radius 
$r$ increases in proportion to $r^D$, and hence one can 
calculate $D$ from 

$$D = \frac{\mathrm{d} \log V}{\mathrm{d} \log r} = \frac{r A(r)}{V(r)}, \tag{24}$$

where $A(r)$ is the surface area of the neighborhood, as 
previously. We can use the same expression to calculate 
the effective dimension of our small-world graphs. Thus 
in the case of an underlying lattice of dimension $d = 1$, 
the effective dimension of the graph is 

$$D = \frac{4r}{\xi} \, \frac{e^{4r/\xi}}{e^{4r/\xi} - 1}, \tag{25}$$

where we have made use of Eqs. (8) and (9). For $r \ll \xi$ 
this tends to one, as we would expect, and for $r \gg \xi$ it 
tends to $4r/\xi$, increasing linearly with the radius of the 
neighborhood. Thus the effective dimension of a 
small-world graph depends on the length-scale on which we look 
at it, in a way reminiscent of the behavior of multifractals. 
This result will become important in Section VI 
when we consider site percolation on small-world graphs. 
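
The two limits of Eq. (25) are easy to confirm numerically. The following sketch evaluates the expression in the algebraically equivalent form $u / (1 - e^{-u})$ with $u = 4r/\xi$, which avoids overflow at large $r$. The function name `effective_dimension` is introduced here for illustration only.

```python
import math


def effective_dimension(r, xi):
    """Eq. (25) for d = 1, written as u / (1 - exp(-u)) with u = 4r/xi."""
    u = 4.0 * r / xi
    if u == 0.0:
        return 1.0  # limiting value as r -> 0
    return u / -math.expm1(-u)


if __name__ == "__main__":
    xi = 100.0
    for r in (0.01, 1.0, 25.0, 100.0, 1000.0):
        print(r, effective_dimension(r, xi), 4.0 * r / xi)
```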

In Fig. 6 we show the effective dimension of neighborhoods 
on a large graph measured in numerical simulations 
(circles), along with the analytic result, Eq. (25) 
(solid line). As we can see from the figure, the numerical 
and analytic results are in good agreement for small radii 
$r$, but the numerical results fall off sharply for larger $r$. 
The reason for this is that Eq. (24) breaks down as $V(r)$ 
approaches the volume of the entire system; $V(r)$ must 
tend to $L^d$ in this limit and hence the derivative in (24) 
tends to zero. The same effect is also seen if one tries to 
use Eq. (24) on ordinary regular lattices of finite size. To 
characterize the dimension of an entire system therefore, 
we use another measure of $D$ as follows. 

On a regular lattice of finite linear size $\ell$, the number 
of vertices $N$ scales as $\ell^D$ and hence we can calculate the 
dimension from 

$$D = \frac{\mathrm{d} \log N}{\mathrm{d} \log \ell}. \tag{26}$$



We can apply the same formula to the calculation of 
the effective dimension of small-world graphs putting 
A'' = L'^, although, since we don't have an analytic so- 
lution for £, we cannot derive an analytic solution for D 
in this case. On the other hand, if we are in the scal- 
ing regime described in Section |lV| — the regime in which 
^ ^ 1 — then Eq. ( p^ applies, along with the limiting 
forms, Eqs. (|l^) and (p^. Substituting into (p6|), this 
gives us 
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FIG. 6. Effective dimension D of small-world graphs. The 
circles are results for D from numerical calculations on an 
L = 1 000 000 system with d = 1, fc = 1 and = 10"^ using 
Eq. ([24I). The errors on the points are in all cases smaller 
than the points themselves. The solid line is Eq. (p^). The 
squares are calculated from Eq. ( |27| ) by numerical differen- 
tiation of simulation results for the scaling function f{x) of 
one- dimensional systems. The dotted line is Eq. ( p^ evalu- 
ated using the third-order Fade approximant to the scaling 
function derived in Section Inset: effective dimension 

from Eq. ( |27| plotted as a function of the scaling variable x. 
The dotted lines represent the asymptotic forms for large and 
small X discussed in the text. 



I _ dlog^ _ 1 
15 ^ dlogl^ ^ d 



dlog/(x) 



dloga; 



(27) 



where $x = (\phi k)^{1/d} L \propto L/\xi$. In other words $D$ is a
universal function of the scaling variable $x$. We know that
$f(x)$ tends to a constant for small $x$ (i.e., $\xi \gg L$), so
that $D = d$ in this limit, as we would expect. For large
$x$ (i.e., $\xi \ll L$), the limiting form $f(x) \sim (\log x)/x$
applies. Substituting into (27),
this gives us $D = d\log x$. In the inset of Fig. 6 we show
$D$ from numerical calculations as a function of $x$ in
one-dimensional systems of a variety of sizes, along with the
expected asymptotic forms, which it follows reasonably
closely. In the main figure we also show this second
measure of $D$ (squares with error bars) as a function of the
system radius $\ell$ (with which it should scale linearly for
large $\ell$, since $\ell \sim \log x$ for large $x$). As the figure shows,
the two measures of effective dimension agree reasonably
well. The numerical errors on the first measure, Eq. (24),
are much smaller than those on the second, Eq. (27)
(which is quite hard to calculate numerically), but the
second measure is clearly preferable as a measure of the
dimension of the entire system, since the first fails badly
when $r$ approaches $\ell$. We also show the value of our
second measure of dimension calculated using the Padé
approximant to $f(x)$ derived in Section IV (dotted line
in the main figure). This agrees well with the numerical
evaluation for radii up to about 1000 and has significantly
smaller statistical error, but overestimates $D$ somewhat
beyond this point because of inaccuracies in the
approximation; the Padé approximant scales as $1/x$ for large
values of $x$ rather than $(\log x)/x$, which means that $D$ will
scale as $x$ rather than $\log x$ for large $x$.
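
As a quick numerical check of the two limits discussed above, the following minimal sketch evaluates $1/D = (1/d)\,[1 + \mathrm{d}\log f/\mathrm{d}\log x]$ by central differences in $\log x$. The function names and the step size `h` are illustrative choices of ours; `f_large_x` uses only the asymptotic form $(\log x)/x$ quoted in the text, not the full scaling function, so the sketch says nothing about the crossover region.

```python
import math

def effective_dimension(f, x, d=1, h=1e-4):
    """D from 1/D = (1/d) * [1 + dlog f / dlog x], using a central
    difference of log f with respect to log x (step h is our choice)."""
    lx = math.log(x)
    dlogf = (math.log(f(math.exp(lx + h))) - math.log(f(math.exp(lx - h)))) / (2 * h)
    return d / (1.0 + dlogf)

def f_large_x(x):
    # Asymptotic large-x form from the text, f(x) ~ log(x)/x.
    # Any constant prefactor drops out of the logarithmic derivative.
    return math.log(x) / x

def f_small_x(x):
    # Small-x limit: f tends to a constant (the value 1.0 is arbitrary).
    return 1.0

for x in (10.0, 100.0, 1000.0):
    # In the large-x limit the second column should track d*log(x).
    print(x, effective_dimension(f_large_x, x), math.log(x))
```

With a constant $f$ the same routine returns $D = d$, the small-$x$ limit.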

VI. PERCOLATION 

In the previous sections of this paper we have exam- 
ined statistical properties of small-world graphs such as 
typical length-scales, vertex-vertex distances, scaling of 
volumes and areas, and effective dimension of graphs. 
These are essentially static properties of the networks; 
to the extent that small-world graphs mimic social net- 
works, these properties tell us about the static structure 
of those networks. However, social science also deals with 
dynamic processes going on within social networks, such 
as the spread of ideas, information, or diseases. This 
leads us to the consideration of dynamical models
defined on small-world graphs. A small amount of research
has already been conducted in this area. Watts, for
instance, has considered the properties of a number of
simple dynamical systems defined on small-world graphs,
such as networks of coupled oscillators and cellular
automata. Barrat and Weigt have looked at the properties
of the Ising model on small-world graphs and derived
a solution for its partition function using the replica trick.
Monasson looked at the spectral properties of the
Laplacian operator on small-world graphs, which tells us
about the time evolution of a diffusive field on the graph.
There is also a moderate body of work in the mathematical
and social sciences which, although not directly
addressing the small-world model, deals with general issues
of information propagation in networks, such as the
adoption of innovations, human epidemiology [26-28],
and the flow of data on the Internet [29,30].

In this section we discuss the modeling of informa- 
tion or disease propagation specifically on small-world 
graphs. Suppose for example that the vertices of a small- 
world graph represent individuals and the bonds between 
them represent physical contact by which a disease can be 
spread. The spread of ideas can be similarly modeled; the 
bonds then represent information connections between 
individuals which could include letters, telephone calls, 
or email, as well as physical contacts. The simplest model 
for the spread of disease is to have the disease spread be- 
tween neighbors on the graph at a uniform rate, starting 
from some initial carrier individual. From our earlier results
for neighborhoods on small-world graphs we already know what
this will look like. If for
example we wish to know how many people in total have
contracted a disease, that number is just equal to the
number $V(r)$ within some radius $r$ of the initial carrier,
where $r$ increases linearly with time. (We assume that no
individual can catch the disease twice, which is the case
with most common diseases.) Thus, our earlier expression for
$V(r)$ tells us that,
for a $d = 1$ small-world graph, the number of individuals
who have had a particular disease increases exponentially,
with a time-constant governed by the typical length-scale
$\xi$ of the graph. Since all real-world social networks have
a finite number of vertices $N$, this exponential growth is
expected to saturate when $V(r)$ reaches $N = L^d$. This is
not a particularly startling result; the usual model for the
spread of epidemics is the logistic growth model, which
shows initial exponential spread followed by saturation.

For a disease like influenza, which spreads fast but is 
self-limiting, the number of people who are ill at any one 
time should be roughly proportional to the area $A(r)$ of
the neighborhood surrounding the initial carrier, with $r$
again increasing linearly in time. This implies that the
epidemic should have a single humped form with time,
like the curves of $A(r)$ plotted earlier in the paper. Note that the
vertical axis in that figure is logarithmic; on linear axes
the curves are bell-shaped rather than quadratic. In the
context of the spread of information or ideas, similar
behavior might be seen in the development of fads. By a fad
we mean an idea which is catchy and therefore spreads
fast, but which people tire of quickly. Fashions, jokes,
toys, or buzzwords might be expected to show popularity
profiles over time similar to those curves of $A(r)$.

However, for most real diseases (or fads) this is not a 
very good model of how they spread. For real diseases 
it is commonly the case that only a certain fraction p of 
the population is susceptible to the disease. This can be 
mimicked in our model by placing a two-state variable 
on each vertex which denotes whether the individual at 
that vertex is susceptible. The disease then spreads only 
within the local "cluster" of connected susceptible ver- 
tices surrounding the initial carrier. One question which 
we can answer with such a model is how high the density 
$p$ of susceptible individuals can be before the largest
connected cluster covers a significant fraction of the entire
network and an epidemic ensues.

Mathematically, this is precisely the problem of site 
percolation on a social network, at least in the case where 
the susceptible individuals are randomly distributed over 
the vertices. To the extent that small-world graphs mimic
social networks, therefore, it is interesting to look at the
percolation problem. The transition corresponds to the
point on a regular lattice at which a percolating cluster
forms whose size increases with the size $L$ of the lattice
for arbitrarily large $L$ [31]. On random graphs there is
a similar transition, marked by the formation of a
so-called "giant component" of connected vertices. On
small-world graphs we can calculate approximately the
percolation probability $p = p_c$ at which the transition
takes place as follows.

Consider a $d = 1$ small-world graph of the kind
pictured in Fig. 1. For the moment let us ignore the
shortcut bonds and consider the percolation properties just of
the underlying regular lattice. If we color in a fraction $p$
of the sites on this underlying lattice, the occupied sites
will form a number of connected clusters. In order for
two adjacent parts of the lattice not to be connected, we
must have a series of at least $k$ consecutive unoccupied
sites between them. The number $n$ of such series can
be calculated as follows. The probability that we have a
series of $k$ unoccupied sites starting at a particular site,
followed by an occupied one, is $p(1-p)^k$. Once we have
such a series, the states of the next $k$ sites are fixed and so
it is not possible to have another such series for $k$ steps.
Thus the number $n$ is given by



$$n = p(1-p)^k (L - kn). \tag{28}$$

Rearranging for $n$ we get

$$n = L\,\frac{p(1-p)^k}{1 + kp(1-p)^k}. \tag{29}$$



For this one-dimensional system, the percolation
transition occurs when we have just one break in the chain,
i.e., when $n = 1$. This gives us a polynomial equation of
order $k + 1$ for $p_c$ which is in general not exactly soluble,
but we can find its roots numerically if we wish.
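
The numerical root-finding mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function names are ours, and the bisection bracket $[1/(k+1), 1]$ is our own choice, made so that the search picks out the root near $p = 1$ (the quantity $p(1-p)^k$ peaks at $p = 1/(k+1)$ and decreases beyond it). The sketch assumes $L$ is large enough that $n > 1$ at the left end of the bracket.

```python
def breaks(p, k, L):
    """Number of breaks n on the underlying ring, n = L q / (1 + k q)
    with q = p (1 - p)^k, as in Eq. (29)."""
    q = p * (1.0 - p) ** k
    return L * q / (1.0 + k * q)

def lattice_threshold(k, L, tol=1e-12):
    """Bisect for the value of p at which n(p) = 1, on the branch
    between the maximum of p(1-p)^k and p = 1 (our bracket choice)."""
    lo, hi = 1.0 / (k + 1), 1.0   # n(p) decreases monotonically on this interval
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if breaks(mid, k, L) > 1.0:
            lo = mid              # still more than one break: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, lattice_threshold(k, 10_000))
```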

Now consider what happens when we introduce
shortcuts into the graph. The number of breaks $n$, Eq. (29),
is also the number of connected clusters of occupied sites
on the underlying lattice. Let us for the moment suppose
that the size of each cluster can be approximated by the
average cluster size. A number $\phi k L$ of shortcuts are now
added to the graph between pairs of vertices chosen
uniformly at random. A fraction $p^2$ of these will connect
two occupied sites and therefore can connect together
two clusters of occupied sites. The problem of when the
percolation transition occurs is then precisely that of the
formation of a giant component on an ordinary random
graph with $n$ vertices. It is known that such a component
forms when the mean coordination number of the random
graph is one [32], or alternatively, when the number of
bonds on the graph is half the number of vertices. In
other words, the transition probability $p_c$ must satisfy



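$$p_c^2\,\phi k L = \frac{1}{2}\,L\,\frac{p_c(1-p_c)^k}{1 + kp_c(1-p_c)^k}, \tag{30}$$

or

$$\phi = \frac{(1-p_c)^k}{2kp_c\left[1 + kp_c(1-p_c)^k\right]}. \tag{31}$$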
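
The condition above gives $\phi$ explicitly as a function of $p_c$, namely $\phi = (1-p_c)^k / \{2kp_c[1 + kp_c(1-p_c)^k]\}$, but in practice one usually wants $p_c$ for a given $\phi$. The sketch below inverts the relation by bisection. The function names are ours, and the approach relies on the right-hand side decreasing monotonically in $p_c$ on $(0, 1)$, which holds because its reciprocal $2kp_c/(1-p_c)^k + 2k^2p_c^2$ is a sum of increasing terms.

```python
def phi_of_pc(pc, k):
    """Shortcut density phi at which the approximate transition sits at pc."""
    q = (1.0 - pc) ** k
    return q / (2.0 * k * pc * (1.0 + k * pc * q))

def percolation_threshold(phi, k, tol=1e-12):
    """Invert phi_of_pc for pc by bisection on (0, 1)."""
    lo, hi = 1e-12, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi_of_pc(mid, k) > phi:
            lo = mid   # curve still above the target phi: pc lies further right
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, percolation_threshold(0.1, k))
```

For $k = 1$ and $\phi = 0.1$ this gives a threshold a little above 0.8, in line with the peak position near $p = 0.8$ reported for that case later in this section.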



We have checked this result against numerical
calculations. In order to find the value of $p_c$ numerically, we
employ a tree-based invasion algorithm similar to the
invaded cluster algorithm used to find the percolation point
in Ising systems [33,34]. This algorithm can calculate the
entire curve of average cluster size versus $p$ in time which
scales as $L\log L$. We define $p_c$ to be the point at
which the average cluster size divided by $L$ rises above a
certain threshold. For systems of infinite size the
transition is instantaneous and hence the choice of threshold
makes no difference to $p_c$, except that $p_c$ can never take a
FIG. 7. Numerical results for the percolation threshold
on $L = 10\,000$ small-world graphs with $k = 1$ (circles),
2 (squares), and 5 (triangles) as a function of the shortcut
density $\phi$. The solid lines are the analytic approximation to
the same quantity, Eq. (31).



value lower than the threshold itself, since even in a fully
connected graph the average cluster size per vertex can
be no greater than the fraction $p_c$ of occupied vertices.
Thus it makes sense to choose the threshold as low as 
possible. In real calculations, however, we cannot use an 
infinitesimal threshold because of finite size effects. For 
the systems studied here we have found that a threshold 
of 0.2 works well. 
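
The threshold criterion described above can be illustrated with a much simpler method than the tree-based invasion algorithm used for the results in this paper. The sketch below is our own substitute, not the authors' code: it occupies sites one at a time in random order and merges clusters with a union-find structure, stopping when the average cluster size divided by $L$ first exceeds the threshold. Several details are assumptions on our part: the function name `estimate_pc`, the reading of "average cluster size" as the mean size of the cluster containing a randomly chosen occupied site, and the construction of the graph by adding $\phi k L$ shortcuts to a ring. It returns a single-realization estimate, so in practice one would average over many seeds.

```python
import random

def estimate_pc(L, k, phi, threshold=0.2, seed=0):
    """Single-realization estimate of the site-percolation threshold on a
    d = 1 small-world graph (union-find sweep; an illustrative substitute
    for the invasion algorithm described in the text)."""
    rng = random.Random(seed)
    # Ring lattice: each site is joined to its neighbors up to k steps away.
    nbrs = [[(i + s) % L for s in range(-k, k + 1) if s != 0] for i in range(L)]
    # Add phi*k*L shortcuts between pairs of vertices chosen uniformly at random.
    for _ in range(int(round(phi * k * L))):
        a, b = rng.randrange(L), rng.randrange(L)
        if a != b:
            nbrs[a].append(b)
            nbrs[b].append(a)

    parent = list(range(L))
    size = [1] * L               # cluster size, valid at root sites
    occupied = [False] * L

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    order = list(range(L))
    rng.shuffle(order)           # occupy sites in random order
    sum_sq = 0                   # running sum of s^2 over all clusters
    for n_occ, site in enumerate(order, start=1):
        occupied[site] = True
        sum_sq += 1
        for j in nbrs[site]:
            if not occupied[j]:
                continue
            a, b = find(site), find(j)
            if a == b:
                continue
            if size[a] < size[b]:
                a, b = b, a
            sum_sq += 2 * size[a] * size[b]   # (a+b)^2 - a^2 - b^2
            parent[b] = a
            size[a] += size[b]
        # Mean size of the cluster containing a random occupied site, over L.
        if sum_sq / (n_occ * L) > threshold:
            return n_occ / L
    return 1.0

if __name__ == "__main__":
    print(estimate_pc(2000, 1, 0.1, seed=1))
```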

Fig. 7 shows the critical probability $p_c$ for systems of
size $L = 10\,000$ for a range of values of $\phi$ for $k = 1$, 2
and 5. The points are the numerical results and the solid
lines are Eq. (31). As the figure shows, the agreement
between simulation and theory is good, although there
are some differences. As $\phi$ approaches one and the value
of $p_c$ drops, the two fail to agree because, as mentioned
above, $p_c$ cannot take a value lower than the threshold
used in its calculation, which was 0.2 in this case. The
results also fail to agree for very low values of $\phi$, where $p_c$
becomes large. This is because Eq. (29) is not a correct
expression for the number of clusters on the underlying
lattice when $n < 1$. This is clear since, when there are
no breaks in the sequence of connected vertices around
the ring, it is not also true that there are no connected
clusters. In fact there is still one cluster; the equality
between number of breaks and number of clusters breaks
down at $n = 1$. The value of $p$ at which this happens is
given by putting $n = 1$ in Eq. (28). Since $p$ is close to
one at this point, its value is well approximated by



$$p \simeq 1 - L^{-1/k}, \tag{32}$$



and this is the value at which the curves in Fig. 7 should
roll off at low $\phi$. For $k = 5$ for example, for which the
roll-off is most pronounced, this expression gives a value
of $p \simeq 0.8$, which agrees reasonably well with what we
see in the figure.
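
The steps behind this estimate can be written out explicitly. The following short derivation is our own filling-in, assuming $L \gg k$ and $p$ close to one, starting from Eq. (28) with $n = 1$ and ending with the numbers for the $k = 5$, $L = 10\,000$ case:

```latex
\begin{align*}
1 &= p(1-p)^k (L-k) && \text{(Eq. (28) with } n = 1\text{)} \\
(1-p)^k &= \frac{1}{p\,(L-k)} \approx \frac{1}{L} && (p \approx 1,\ L \gg k) \\
p &\approx 1 - L^{-1/k} \\
k = 5,\ L = 10\,000:\quad p &\approx 1 - 10^{-4/5} \approx 1 - 0.158 = 0.842
\end{align*}
```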

There is also an overall tendency in Fig. 7 for our
analytic expression to overestimate the value of $p_c$ slightly.
This we put down to the approximation we made in the
derivation of Eq. (31) that all clusters of vertices on the
underlying lattice can be assumed to have the size of
the average cluster. In actual fact, some clusters will
be smaller than the average and some larger. Since the
shortcuts will connect to clusters with probability
proportional to the cluster size, we can expect percolation
to set in within the subset of larger-than-average clusters
before it would set in if all clusters had the average size.
This makes the true value of $p_c$ slightly lower than that
given by Eq. (31). In general, however, the equation gives
a good guide to the behavior of the system.

We have also examined numerically the behavior of
the mean cluster radius $\rho$ for percolation on small-world
graphs. The radius of a cluster is defined as the
average distance between vertices within the cluster, along
the edges of the graph within the cluster. This quantity
is small for small values of the percolation probability $p$
and increases with $p$ as the clusters grow larger. When we
reach percolation and a giant component forms it reaches
a maximum value and then drops as $p$ increases further.
The drop happens because the percolating cluster is most
filamentary when percolation has only just set in and so
paths between vertices are at their longest. With further
increases in $p$ the cluster becomes more highly connected
and the average shortest path between two vertices
decreases.

By analogy with percolation on regular lattices we
might expect the average cluster radius for a given value
of $\phi$ to satisfy the scaling form

$$\rho = \ell^{\gamma/\nu}\,\tilde{\rho}\bigl((p - p_c)\,\ell^{1/\nu}\bigr), \tag{33}$$

where $\tilde{\rho}(x)$ is a universal scaling function, $\ell$ is the radius
of the entire system and $\gamma$ and $\nu$ are critical exponents.
In fact this scaling form is not precisely obeyed by the
current system because the exponents $\nu$ and $\gamma$ depend in
general on the dimension of the lattice. As we showed in
Section V, the dimension $D$ of a small-world graph
depends on the length-scale on which you look at it. Thus
the value of $D$ "felt" by a cluster of radius $\rho$ will vary with
$\rho$, implying that $\nu$ and $\gamma$ will vary both with the
percolation probability and with the system size. If we restrict
ourselves to a region sufficiently close to the percolation
threshold, and to a sufficiently small range of values of $\ell$,
then Eq. (33) should be approximately correct.

In Fig. 8 we show numerical data for $\rho$ for small-world
graphs with $k = 1$, $\phi = 0.1$ and $L$ equal to a power of
two from 512 up to 16 384. As we can see, the data show
the expected peaked form, with the peak in the region
of $p = 0.8$, close to the expected position of the
percolation transition. In order to perform a scaling collapse of
these data we need first to extract a suitable value of $p_c$.
FIG. 8. Average cluster radius $\rho$ as a function of the
percolation probability $p$ for site percolation on small-world graphs
with $k = 1$, $\phi = 0.1$ and $L$ equal to a power of two from 512 up
to 16 384 (circles, squares, diamonds, upward-pointing
triangles, left-pointing triangles and downward-pointing triangles
respectively). Each set of points is averaged over 100
realizations of the corresponding graph. Inset: the same data
collapsed according to Eq. (33) with $\nu = 0.59$, $\gamma = 1.3$ and
$p_c = 0.74$.

We can do this by performing a fit to the positions of the
peaks in $\rho$. Since the scaling function $\tilde{\rho}(x)$ is
(approximately) universal, the positions of these peaks all occur
at the same value of the scaling variable $y = (p - p_c)\,\ell^{1/\nu}$.
Calling this value $y_0$ and the corresponding percolation
probability $p_0$, we can rearrange for $p_0$ as a function of $\ell$
to get

$$p_0 = p_c + y_0\,\ell^{-1/\nu}. \tag{34}$$

Thus if we plot the measured positions $p_0$ as a function
of $\ell^{-1/\nu}$, the vertical-axis intercept should give us the
corresponding value of $p_c$. We have done this for a single
value of $\nu$ in the inset to Fig. 9, and in the main figure
we show the resulting values of $p_c$ as a function of $1/\nu$. If
we now perform our scaling collapse, with the restriction
that the values of $\nu$ and $p_c$ fall on this line, then the best
coincidence of the curves for $\rho$ is obtained when $p_c = 0.74$
and $\nu = 0.59 \pm 0.05$ (see the inset to Fig. 8). The value
of $\gamma$ can be found separately by requiring the heights of
the peaks to match up, which gives $\gamma = 1.3 \pm 0.1$. The
collapse is noticeably poorer when we include systems
of size smaller than $L = 512$, and we attribute this not
merely to finite size corrections to the scaling form, but
also to variation in the values of the exponents $\gamma$ and $\nu$
with the effective dimension of the percolating cluster.
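
The intercept extraction described above amounts to a straight-line fit of the peak position $p_0$ against $\ell^{-1/\nu}$ for a trial value of $\nu$. A minimal sketch of that step follows. The helper name `intercept_fit` is ours, and the peak positions are synthetic values generated from the linear relation $p_0 = p_c + y_0\,\ell^{-1/\nu}$ itself; they are not data from the paper. The radii and the value of $y_0$ are hypothetical, chosen only to exercise the fit.

```python
def intercept_fit(ells, p0s, nu):
    """Ordinary least-squares line through the points (ell**(-1/nu), p0).
    Returns (pc, y0): the vertical-axis intercept and the slope."""
    xs = [ell ** (-1.0 / nu) for ell in ells]
    n = len(xs)
    mx, my = sum(xs) / n, sum(p0s) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, p0s))
    y0 = sxy / sxx
    pc = my - y0 * mx
    return pc, y0

# Synthetic illustration only: nu and pc are the values quoted in the text,
# while y0 and the system radii are hypothetical.
nu_true, pc_true, y0_true = 0.59, 0.74, 1.5
ells = [5.0, 8.0, 12.0, 18.0, 27.0, 40.0]
p0s = [pc_true + y0_true * ell ** (-1.0 / nu_true) for ell in ells]

pc_fit, y0_fit = intercept_fit(ells, p0s, nu_true)
print(pc_fit, y0_fit)
```

Repeating the fit over a range of trial $\nu$ values gives the curve of $p_c$ against $1/\nu$ along which the scaling collapse is then constrained.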

The value $p_c = 0.74$ is in respectable agreement with
the value of 0.82 from our direct numerical
measurements. We note that $\nu$ is expected to tend to $\frac{1}{2}$ in
FIG. 9. Best fit values of $p_c$ as a function of $1/\nu$. Inset:
the values are calculated from the vertical-axis intercept of
a plot of the position $p_0$ of the peak of $\rho$ against $\ell^{-1/\nu}$ (see
Eq. (34)).

the limit of an infinite-dimensional system. The value
$\nu = 0.59$ found here therefore confirms our contention
that small-world graphs have a high effective dimension
even for quite moderate values of $\phi$, and thus are in
some sense close to being random graphs. (On a
two-dimensional lattice by contrast $\nu = \frac{4}{3}$.)

VII. CONCLUSIONS 

In this paper we have studied the small-world network 
model of Watts and Strogatz, which mimics the behav- 
ior of networks of social interactions. Small-world graphs 
consist of a set of vertices joined together in a regular lat- 
tice, plus a low density of "shortcuts" which link together 
pairs of vertices chosen at random. We have looked at 
the scaling properties of small-world graphs and argued 
that there is only one typical length-scale present other 
than the fundamental lattice constant, which we denote
$\xi$ and which is roughly the typical distance between the
ends of shortcuts. We have shown that this length-scale
governs the transition of the average vertex-vertex
distance on a graph from linear to logarithmic scaling with
increasing system size, as well as the rate of growth of
the number of vertices in a neighborhood of fixed radius
about a given point. We have also shown that the value
of $\xi$ diverges on an infinite lattice as the density of
shortcuts tends to zero, and therefore that the system
possesses a continuous phase transition in this limit. Close
to the phase transition, where $\xi$ is large, we have shown
that the average vertex-vertex distance on a finite graph
obeys a simple scaling form and in any given dimension
is a universal function of a single scaling variable which
depends on the density of shortcuts, the system size and
the average coordination number of the graph. We have
calculated the form of the scaling function to fifth order 
in the shortcut density using a series expansion and to 
third order using a Pade approximant. We have defined 
two measures of the effective dimension D of small-world 
graphs and find that the value of D depends on the scale 
on which you look at the graph in a manner reminiscent 
of the behavior of multifractals. Specifically, at
length-scales shorter than $\xi$ the dimension of the graph is simply
that of the underlying lattice on which it is built, and for
length-scales larger than $\xi$ it increases linearly, with a
characteristic constant proportional to $\xi$. The value of $D$
increases logarithmically with the number of vertices in 
the graph. We have checked all of these results by exten- 
sive numerical simulation of the model and in all cases 
we find good agreement between the analytic predictions 
and the simulation results. 

In the last part of the paper we have looked at site
percolation on small-world graphs as a model of the spread
of information or disease in social networks. We have
derived an approximate analytic expression for the
percolation probability $p_c$ at which a "giant component" of
connected vertices forms on the graph and shown that
this agrees well with numerical simulations. We have also
performed extensive numerical measurements of the
typical radius of connected clusters on the graph as a function
of the percolation probability and shown by performing
a scaling collapse that these obey, to a reasonable
approximation, the expected scaling form in the vicinity of
the percolation transition. The characteristic exponent
$\nu$ takes a value close to $\frac{1}{2}$, indicating that, as far as
percolation is concerned, the graph's properties are close to
those of a random graph.